Previous studies have shown that spike-timing-dependent plasticity (STDP) can be used in spiking neural networks (SNN) to extract visual features of low or intermediate complexity in an unsupervised manner. These studies, however, used relatively shallow architectures, and only one layer was trainable. Another line of research has demonstrated - using rate-based neural networks trained with back-propagation - that having many layers increases recognition robustness, an approach known as deep learning. We thus designed a deep SNN comprising several convolutional (trainable with STDP) and pooling layers. We used a temporal coding scheme in which the most strongly activated neurons fire first, while less activated neurons fire later or not at all. The network was exposed to natural images. Thanks to STDP, neurons progressively learned features corresponding to prototypical patterns that were both salient and frequent. Only a few tens of examples per category were required, and no labels were needed. After learning, the complexity of the extracted features increased along the hierarchy, from edge detectors in the first layer to object prototypes in the last layer. Coding was very sparse, with only a few thousand spikes per image, and in some cases the object category could be reasonably well inferred from the activity of a single higher-order neuron. More generally, the activity of a few hundred such neurons contained robust category information, as demonstrated using a classifier on the Caltech 101, ETH-80, and MNIST databases. We also demonstrate the superiority of STDP over other unsupervised techniques such as random crops (HMAX) or auto-encoders. Taken together, our results suggest that the combination of STDP with latency coding may be key to understanding how the primate visual system learns, as well as its remarkable processing speed and low energy consumption.
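The intensity-to-latency coding mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, the linear mapping, and the `t_max`/`threshold` parameters are illustrative assumptions that simply capture the stated rule: stronger activations fire earlier, and weakly activated neurons never fire.

```python
import numpy as np

def intensity_to_latency(activations, t_max=1.0, threshold=0.0):
    """Hypothetical latency-coding sketch: stronger activations fire
    earlier; activations at or below `threshold` never fire (np.inf)."""
    activations = np.asarray(activations, dtype=float)
    times = np.full(activations.shape, np.inf)  # default: no spike
    active = activations > threshold
    a_max = activations.max()
    if a_max > threshold:
        # Linearly map the strongest activation to time 0 and
        # progressively weaker activations to later times, up to t_max.
        times[active] = t_max * (1.0 - activations[active] / a_max)
    return times

# Example: the strongest input spikes first, the zero input never spikes.
spike_times = intensity_to_latency([0.9, 0.3, 0.0])
```

Under this scheme a single wave of first spikes already carries the rank order of the activations, which is what allows the downstream STDP layers to learn from very few spikes per image.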